A PROOF OF SMALE'S MEAN VALUE CONJECTURE 



GERALD SCHMIEDER 
Abstract. A proof of Smale's mean value conjecture from 1981 is given. 

Connected with his investigations on the complexity of determining polynomial roots 
by Newton's method, Steve Smale [2] considered the difference quotients 

\[ D(\zeta, z) := \frac{p(\zeta) - p(z)}{\zeta - z}, \]

where $p$ is a non-constant polynomial, $p'(\zeta) = 0$, and $z \neq \zeta$ is an arbitrary complex 
number. He asked for a universal constant $K$ (i.e. one valid for all such polynomials and all 
such $z$) with $|D(\zeta, z)| \le K\,|p'(z)|$ for at least one derivative zero $\zeta$. 
He proved in [2], using results on univalent functions, that this is true for $K = 4$, 
and conjectured that $K = 1$ is best possible. 

Obviously one may without loss of generality assume that $z = 0$ and $p(0) = 0$. Then 
the question is to estimate the number $\min\left\{ \left| \frac{p(\zeta)}{\zeta\, p'(0)} \right| : p'(\zeta) = 0 \right\}$. Note that the 
conjecture trivially holds for polynomials of degree one. 

The conjectured bound 1 can be sharpened slightly if we consider only polynomials 
of a fixed degree. Here we will prove the following: 

Let $p \in \mathbb{C}[z]$ be a polynomial of degree $n > 1$ with $p(0) = 0$ and $p'(0) \neq 0$. Then 

\[ \min\left\{ \left| \frac{p(\zeta)}{\zeta\, p'(0)} \right| : p'(\zeta) = 0 \right\} \le \frac{n-1}{n}. \]

Equality occurs only for $p(z) = a_1 z + a_n z^n$ with arbitrary $a_1, a_n \in \mathbb{C} \setminus \{0\}$. 
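
The equality case can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `rho` and the two sample polynomials are chosen here for illustration and are not part of the paper. For $p(z) = a_1 z + a_n z^n$ every derivative zero satisfies $a_n \zeta^{n-1} = -a_1/n$, so the quotient equals $1 - 1/n$ at each of them.

```python
import numpy as np

def rho(coeffs):
    """Return min |p(zeta) / (zeta * p'(0))| over the zeros zeta of p'.

    coeffs: coefficients of p, highest degree first (numpy.poly1d convention).
    Assumes p(0) = 0 and p'(0) != 0, as in the statement above.
    """
    p = np.poly1d(coeffs)
    dp = p.deriv()
    return min(abs(p(zeta) / (zeta * dp(0))) for zeta in dp.roots)

n = 5
extremal = [1.0] + [0.0] * (n - 2) + [1.0, 0.0]   # p(z) = z^5 + z
sample = [1.0, 1.0, -2.0, 0.0]                     # p(z) = z (z - 1)(z + 2), degree 3

rho_extremal = rho(extremal)   # analytically (n - 1)/n
rho_sample = rho(sample)       # expected to stay below (3 - 1)/3
```

The second polynomial is only one arbitrary test case; a single example illustrates the bound but of course proves nothing.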

Let $n > 1$ be fixed and define $\mathcal{F}_n$ as the class of monic complex polynomials 
$p$ of degree $n$ with $p(0) = 0$, $p'(0) \neq 0$ and $p(\zeta) \neq 0$ for all derivative zeros $\zeta$ of $p$. 
Obviously it suffices to consider polynomials $p \in \mathcal{F}_n$ in order to give a proof of 
Smale's conjecture. For such $p$ we define 

\[ \rho(p, \zeta) := \left| \frac{p(\zeta)}{\zeta\, p'(0)} \right| \]

and the associated number as 

\[ \rho(p) := \min\{ \rho(p, \zeta) : p'(\zeta) = 0 \}. \]

The zero $\zeta_0$ of $p'$ is essential if 

\[ \rho(p) = \left| \frac{p(\zeta_0)}{\zeta_0\, p'(0)} \right|. \]

Note that a polynomial may have more than one essential derivative zero. We call 
$p \in \mathcal{F}_n$ simple if $p''(\zeta) \neq 0$ for all essential derivative zeros $\zeta$ of $p$. 
A polynomial $p_0 \in \mathcal{F}_n$ is maximal if $\rho(p) \le \rho(p_0)$ for all $p \in \mathcal{F}_n$. Below we will 
determine the maximal polynomials in $\mathcal{F}_n$. In the following we will prove: 

Theorem 1. For each $p \in \mathcal{F}_n$ there exists some $q \in \mathcal{F}_n$ whose zeros $w_2, \dots, w_n \neq 0$ 
have the same modulus and for which $\rho(q) \ge \rho(p)$. 

In order to prove Theorem 1 we may without loss of generality assume that $|z_j| \le 1$ 
holds for the zeros $z_2, \dots, z_n \neq 0$ of $p$, with equality for at least one of them. 
Otherwise we consider the polynomial $s^n p(z/s)$ with $s = 1/\max\{|z_2|, \dots, |z_n|\}$, whose zeros are the points $s z_j$. The 
associated number of this polynomial is the same as that of $p$. Moreover we may 
assume that $|z_n| < 1$. 
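
The invariance of the associated number under this rescaling can be verified directly. The sketch below writes $p_s(z) := s^n p(z/s)$ for an arbitrary $s > 0$ (the name $p_s$ is introduced here only for illustration) and uses the quotient $p(\zeta)/(\zeta\, p'(0))$ from the introduction:

```latex
p_s'(z) = s^{n-1}\, p'(z/s), \qquad p_s'(s\zeta) = 0 \iff p'(\zeta) = 0,
\\[4pt]
\frac{p_s(s\zeta)}{s\zeta \; p_s'(0)}
  = \frac{s^{n}\, p(\zeta)}{s\zeta \cdot s^{n-1}\, p'(0)}
  = \frac{p(\zeta)}{\zeta\, p'(0)} .
```

So the derivative zeros of $p_s$ are exactly the points $s\zeta$, the quotient takes the same value at corresponding derivative zeros, and the minimum over all of them is unchanged.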



1. The basic idea 

If $p \in \mathcal{F}_n$ is a polynomial with the zeros $z_2, \dots, z_n$ besides $0$ and the derivative zero 
$\zeta$ with $p(\zeta) \neq 0$, then 

\[ \rho(p, \zeta) = \left| \frac{p(\zeta)}{\zeta\, p'(0)} \right| = \prod_{j=2}^{n} \left| 1 - \frac{\zeta}{z_j} \right|. \tag{1} \]

As explained, we may assume that $|z_j| \le 1$ for $j = 2, \dots, n$ and $|z_n| < 1$. We let 
$z_2, \dots, z_{n-1}$ be fixed and vary $z_n$, i.e., we consider the polynomials 

\[ Q(z, u) = (z - u)\, z \prod_{j=2}^{n-1} (z - z_j) = (z - u)\, q(z). \tag{2} \]

We assume for the moment that $\zeta$ is a zero of $p'$, but not a zero of $p''$. The implicit 
function theorem (cf. [1]) shows the existence of a holomorphic function $\zeta(u)$ with 
$\zeta(z_n) = \zeta$ and $\frac{\partial Q}{\partial z}(\zeta(u), u) = 0$, defined in a neighborhood of $z_n$. If we move $u$ 
along a path $\gamma$ in $\mathbb{C}$ starting at $\gamma(0) = z_n$, then we have an unrestricted analytic 
continuation of $\zeta(\gamma(t))$ if $\frac{\partial^2 Q}{\partial z^2}(\zeta(\gamma(t)), \gamma(t)) \neq 0$ for all $t$. If the path meets 
such exceptional points, we still have a continuation of $\zeta(\gamma(t))$ which is 
at least continuous at these points. Note that the values of $\zeta(\gamma(t))$ with respect to 
this continuation move on the Riemann surface $R$, which is defined by the equation 
$Q'(z, u) = 0$ (derivative with respect to $z$). We will discuss this surface in section 2. 

It follows ($Q'$ denotes the derivative of $Q$ with respect to $z$) that 

\[ \rho(Q(\cdot, u), \zeta(u)) = \left| \frac{Q(\zeta(u), u)}{\zeta(u)\, Q'(0, u)} \right| = \left| 1 - \frac{\zeta(u)}{u} \right| \prod_{j=2}^{n-1} \left| 1 - \frac{\zeta(u)}{z_j} \right|. \tag{3} \]
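Note that $\ln \rho(Q(\cdot, u), \zeta(u)) = \Re \log g(u, \zeta(u))$, where $g(u, z) := \frac{Q(z, u)}{z\, Q'(0, u)}$. 

Let $p \in \mathcal{F}_n$ and let $\zeta$ be a (not necessarily essential) derivative zero of $p$. As above let 
$0, z_2, \dots, z_n \in \overline{E}$ be the zeros of $p$ and $|z_n| < 1$. If $\gamma : [0, 1] \to \mathbb{C}$ is a path with 
$\gamma(0) = z_n$, $\gamma(1) = u$ we see 

\[ \frac{d}{dt} \ln \rho\big(Q(\cdot, \gamma(t)), \zeta(\gamma(t))\big) = \Re\, \frac{d}{dt} \log g\big(\gamma(t), \zeta(\gamma(t))\big). \]

Note that $\zeta(\gamma(t))$ depends on the path $\gamma$. So we have 

\[ \ln \rho(Q(\cdot, u), \zeta(u)) - \ln \rho(p, \zeta) = \int_0^1 \frac{d}{dt} \ln \rho\big(Q(\cdot, \gamma(t)), \zeta(\gamma(t))\big)\, dt = \Re \int_0^1 \frac{d}{dt} \log g\big(\gamma(t), \zeta(\gamma(t))\big)\, dt. \]

The integrand can be calculated as follows. Since $Q(z, v) = (z - v)\, q(z)$ and $Q'(0, v) = -v\, q'(0)$, we have 

\[ g(v, \zeta(v)) = -\frac{(\zeta(v) - v)\, q(\zeta(v))}{v\, \zeta(v)\, q'(0)}, \]

and $Q'(\zeta(v), v) = 0$, i.e. $q(\zeta(v)) + (\zeta(v) - v)\, q'(\zeta(v)) = 0$, gives $\frac{q'(\zeta(v))}{q(\zeta(v))} = -\frac{1}{\zeta(v) - v}$. This leads to 

\[ \frac{d}{dv} \log g(v, \zeta(v)) = \frac{\zeta'(v) - 1}{\zeta(v) - v} + \frac{q'(\zeta(v))}{q(\zeta(v))}\, \zeta'(v) - \frac{\zeta'(v)}{\zeta(v)} - \frac{1}{v} = -\left( \frac{1}{\zeta(v) - v} + \frac{1}{v} + \frac{\zeta'(v)}{\zeta(v)} \right). \]

It follows that 

\[ \ln \rho(Q(\cdot, u), \zeta(u)) = \ln \rho(p, \zeta) - \Re \int_{\gamma} \left( \frac{1}{\zeta(v) - v} + \frac{1}{v} + \frac{\zeta'(v)}{\zeta(v)} \right) dv \]

and therefore 

\[ \rho(Q(\cdot, u), \zeta(u)) = \rho(p, \zeta) \cdot \exp\left( -\Re \int_{\gamma} \left( \frac{1}{\zeta(v) - v} + \frac{1}{v} + \frac{\zeta'(v)}{\zeta(v)} \right) dv \right) = \rho(p, \zeta) \cdot \left| \frac{\zeta}{\zeta(u)} \right| \cdot \exp\left( -\Re \int_{\gamma} \frac{\zeta(v)}{\zeta(v) - v} \cdot \frac{dv}{v} \right). \tag{4} \]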



2. The Riemann surface R 

The Riemann surface $R$ of the derivative zeros of $Q$ is given by the equation 

\[ Q'(w, u) = q(w) + (w - u)\, q'(w) = 0. \tag{5} \]

This (actually compact) manifold $R$ consists of the points $w$ which are the derivative 
zeros of $Q(\cdot, u)$, and the equation gives local uniformizations of $R$ if the derivative 
of $u = \varphi(w) := w + \frac{q(w)}{q'(w)}$ with respect to $w$ does not vanish (note that these branch 
points are also described by $\frac{\partial^2 Q}{\partial z^2}(w, u) = 0$). So the points $w$ where $2 q'(w)^2 = q(w)\, q''(w)$ 
are branch points of the surface. These branch points play in fact no 
special role on the Riemann surface; their appearance depends on the special local 
coordinates which are given by the defining equation (example: the surface of the 
square root is defined by $w^2 = u$ with $0$ as a branch point; if we add this point, 
it is conformally equivalent to the plane resp. $\hat{\mathbb{C}}$). They can actually be added as 
"normal" points to the surface and have simply connected neighborhoods on which 
local coordinates can be found. 
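
The branch point condition follows in one line from the defining equation. A short derivation, assuming $q'(w) \neq 0$ so that (5) can be solved for $u$:

```latex
u = \varphi(w) = w + \frac{q(w)}{q'(w)}
\quad\Longrightarrow\quad
\varphi'(w) = 1 + \frac{q'(w)^2 - q(w)\,q''(w)}{q'(w)^2}
            = \frac{2\,q'(w)^2 - q(w)\,q''(w)}{q'(w)^2},
\\[4pt]
\frac{\partial^2 Q}{\partial z^2}(w,u) = 2\,q'(w) + (w-u)\,q''(w)
  = \frac{2\,q'(w)^2 - q(w)\,q''(w)}{q'(w)}
  \quad\text{(using } w - u = -q(w)/q'(w)\text{)} .
```

Both expressions vanish exactly when $2 q'(w)^2 = q(w)\, q''(w)$, which is why the two descriptions of the branch points agree.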

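$R$, as a compact surface, may be regarded as an $(n-1)$-sheeted covering of $\hat{\mathbb{C}}$, and $\varphi$ 
gives a canonical projection $R \to \hat{\mathbb{C}}$. 
We define 

\[ f(u, \zeta(u)) := \frac{\zeta_0}{\zeta(u)} \exp\left( -\int_{\gamma_u} \frac{\zeta(v)}{\zeta(v) - v} \cdot \frac{dv}{v} \right), \tag{6} \]

where $\gamma_u : [0, 1] \to \mathbb{C}$ with $\gamma_u(0) = z_n$, $\gamma_u(1) = u$ and $\zeta(\gamma_u(0)) = \zeta_0$ (some fixed 
derivative zero of $p$), $\zeta(\gamma_u(1)) = \zeta(u)$. By (4) we have 

\[ \rho(Q(\cdot, u), \zeta(u)) = \rho(p, \zeta_0) \cdot |f(u, \zeta(u))|. \tag{7} \]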

$f$ is, up to isolated singularities, a holomorphic function on $R$, because it has this 
property in the local coordinate $u \in \mathbb{C}$ (the case $u = \infty$ we discuss separately). The 
holomorphy is not obvious in the following cases: 

(i) $Q'(0, u_0) = Q(0, u_0) = 0$, or 

(ii) $Q(w_1, u_1) = 0$ (this includes the case $u = \zeta(u)$), or 

(iii) $2 q'(w_2)^2 = q(w_2) \cdot q''(w_2)$ (branch points) 

(in case of (i) or (ii) the polynomial $Q$ does not belong to the class $\mathcal{F}_n$). We discuss 
these three cases. 

Case (i): The polynomial $p$ has only simple zeros. So $Q'(0, u_0) = 0$ is only possible if 
$u_0 = 0$. A direct calculation gives that $\rho(Q(\cdot, 0)) = \rho(Q(\cdot, 0), 0) = \frac{1}{2}$. Thus $f$ has a 
removable singularity at $u = 0$ if $\zeta(u) = 0$. If $\varphi(w_0) = 0$, but $w_0 \neq 0$ (and thus $w_0$ is not 
essential for $Q(\cdot, 0)$), then we see that $f$ has a pole at $w_0$, because $\rho(Q(\cdot, \varphi(w)), w) \to \infty$ 
if $w \to w_0$. 

Case (ii): The assumption implies that $Q(\cdot, u_1)$ has a multiple zero at the point $w_1$. 
This is only possible if $u_1$ is one of the zeros $z_2, \dots, z_{n-1}$ of $p$ ($u = 0$ has already 
been discussed) and $u_1 = w_1$. By the definition we see that $\rho(Q(\cdot, u_1), w_1) = 0$ 
and $\rho(Q(\cdot, u_1), w) > 0$ if $\varphi(w) = u_1$ and $w \neq w_1$. So these singularities of $f$ are 
removable. Moreover we have $\rho(Q(\cdot, u_1), w_1) = 0$ in this case. 

Case (iii): If $2 q'(w_2)^2 = q(w_2)\, q''(w_2)$, then $w_2 \notin \{0, z_2, \dots, z_{n-1}\}$, because $q$ has only 
simple zeros in these points. (7) shows that $f$ is bounded in a neighborhood of the 
branch point $w_2$ on $R$. Again we conclude that $f$ has a removable singularity in this 
case. 

We summarize: 

Lemma 1. The function $f$ as defined in (6) is meromorphic on the Riemann surface 
$R' := \{w \in R : \varphi(w) \in \mathbb{C}\}$. It has poles exactly in the points $w \in R'$ with $\varphi(w) = 0$ 
and $w \neq 0$. The zeros of $f$ are the points $w \in R'$ with $w = \varphi(w) \in \{z_2, \dots, z_{n-1}\}$. 

We can give an alternative representation of $f$. It holds $\rho(Q(\cdot, u), \zeta(u)) = \left| \frac{Q(\zeta(u), u)}{\zeta(u)\, Q'(0, u)} \right|$. 
From (7) we obtain that $f(u, \zeta(u))$ equals $\frac{Q(\zeta(u), u)}{\zeta(u)\, Q'(0, u)} \cdot \frac{\zeta_0\, Q'(0, z_n)}{Q(\zeta_0, z_n)}$, up to a possible factor of 
modulus one. For $u = z_n$ we see that this factor is one. By (5) we obtain the 
representation 

\[ f(u, \zeta(u)) = \frac{z_n\, \zeta_0\, q'(\zeta_0)}{q(\zeta_0)^2} \cdot \frac{q(\zeta(u))^2}{u\, \zeta(u)\, q'(\zeta(u))}. \tag{8} \]

Finally we investigate the structure of $R$ close to $u = \infty$. The point infinity is 
no branch point of $R$, because the function $1/\varphi(1/w)$ has in $w = 0$ the expansion 
$w \left( \frac{n-1}{n} + a_1 w + \dots \right)$. 
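
The leading coefficient of this expansion can be read off from the degree of $q$. A brief sketch, where $b$ stands for the coefficient of $w^{n-2}$ in $q$ (a symbol introduced here only for illustration) and $\varphi(w) = w + q(w)/q'(w)$ is the solution of (5) for $u$:

```latex
q(w) = w^{n-1} + b\,w^{n-2} + \dots, \qquad
q'(w) = (n-1)\,w^{n-2} + (n-2)\,b\,w^{n-3} + \dots,
\\[4pt]
\frac{q(w)}{q'(w)} = \frac{w}{n-1} + O(1), \qquad
\varphi(w) = \frac{n}{n-1}\,w + O(1) \qquad (w \to \infty),
\\[4pt]
\frac{1}{\varphi(1/w)} = \frac{w}{\frac{n}{n-1} + O(w)}
  = w\left(\frac{n-1}{n} + a_1 w + \dots\right) \qquad (w \to 0).
```

Since the coefficient $\frac{n-1}{n}$ of $w$ is non-zero, $1/\varphi(1/w)$ is locally injective at $w = 0$, which is the statement that infinity is not a branch point.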

For $u \in \overline{E}$ all zeros of $Q(\cdot, u)$ are contained in $\overline{E}$. By the Gauss-Lucas theorem we 
know that the zeros of the derivative $Q'(z, u) = \frac{\partial Q}{\partial z}(z, u)$ lie in the convex hull $C$ of 
the zeros. They are inner points of $C$ with the only exception of multiple zeros of 
$Q$. None of these derivative zeros in our case is of order bigger than 1. So the same 
argument gives that the zeros of the second order derivative $Q''(z, u) = \frac{\partial^2 Q}{\partial z^2}(z, u)$ 
are points of the open unit disk $E$. So the same is true for the branch points of $R$. To 
be more precise, all branch points $w$ of $R$ fulfill $|\varphi(w)| < 1$. 
The subset $D_1$ of $R$ with $\varphi(D_1) = \overline{E}$ therefore contains all branch points. 
As a consequence, the complement $R \setminus D_1$ (including $\infty$) consists of $n - 1$ simply 
connected domains $G_1, \dots, G_{n-1}$. Let $\zeta(u)$ be the function which is defined on 
$G_k$ with respect to a fixed start point $\zeta_0$ with $\varphi(\zeta_0) = z_n$. Then the mappings 
$\Phi_k := \varphi|_{G_k} : G_k \to \{u \in \hat{\mathbb{C}} : |u| > 1\}$ are conformal. 

The boundaries of the domains $G_j$ are pairwise disjoint. Each $\partial G_j$ is mapped 
homeomorphically by $\varphi$ onto the unit circle. 
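
The Gauss-Lucas step used above is easy to illustrate numerically. The following is a small sketch, assuming NumPy; the random zeros are hypothetical sample data, not taken from the paper. It builds a polynomial with all zeros in the closed unit disk (one of them at $0$) and records the largest modulus among the zeros of its first and second derivative.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 5
# Hypothetical sample: k zeros drawn from the closed unit disk, plus the zero at 0.
others = np.sqrt(rng.uniform(size=k)) * np.exp(2j * np.pi * rng.uniform(size=k))
zeros = np.concatenate(([0.0], others))

Q = np.poly1d(zeros, r=True)      # monic polynomial with the prescribed zeros
crit = Q.deriv(1).roots           # zeros of the first derivative
second = Q.deriv(2).roots         # zeros of the second derivative

max_crit = np.abs(crit).max()
max_second = np.abs(second).max()
```

Both maxima should not exceed 1 (up to rounding), because the zeros of each derivative lie in the convex hull of the zeros of the polynomial it was obtained from.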

It holds $P(z, u) := \frac{Q(z, u)}{u} = \left( \frac{z}{u} - 1 \right) q(z)$. The derivative zeros of $P$ with respect 
to $z$ are the same as those of $Q$. For $u \to \infty$ the polynomials $P(z, u)$ tend locally 
uniformly to $-q(z)$. So, in this case, $\zeta(u)$ tends to $\infty$ on one $G_k$, let us say on $G_1$. 
For $k = 2, \dots, n - 1$ it follows that each $\zeta(u) \in G_k$ tends to some zero 
of $q'$ if $u \to \infty$. 

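2.1. $\zeta(u)$ on $G_1$. From (5) we see that $\zeta(u)$ has a pole in $\infty \in G_1$. The equation 

\[ \frac{u}{\zeta(u)} = 1 + \frac{q(\zeta(u))}{\zeta(u)\, q'(\zeta(u))} \]

gives that $\frac{u}{\zeta(u)} \to \frac{n}{n-1}$. It holds 

\[ f(u, \zeta(u)) = \frac{z_n\, \zeta_0\, q'(\zeta_0)}{q(\zeta_0)^2} \cdot \frac{q(\zeta(u))^2}{u\, \zeta(u)\, q'(\zeta(u))} = \frac{z_n\, \zeta_0\, q'(\zeta_0)}{q(\zeta_0)^2} \cdot \frac{q(\zeta(u))}{\zeta(u)\, q'(\zeta(u))} \cdot \frac{\zeta(u)}{u} \cdot \frac{q(\zeta(u))}{\zeta(u)}. \]

All fractions stay finite (and non-zero) for $u \to \infty$, except for the last one, 
which has a pole of order $n - 2$ at $\infty$, and so has $f$. 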
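
Two ingredients of this subsection can be checked numerically. The following is a minimal sketch, assuming NumPy; the zeros `zs` and the parameter values `u` and `u_big` are hypothetical sample data. Part (a) compares $|Q(\zeta,u)/(\zeta\, Q'(0,u))|$ with $|q(\zeta)^2/(u\,\zeta\, q'(\zeta)\, q'(0))|$ at every derivative zero of $Q(\cdot,u)$, which is the identity behind the representation of $f$ in terms of $q$. Part (b) looks at the derivative zero of largest modulus for a large real $u$.

```python
import numpy as np

zs = np.array([0.5, -0.6j, 0.3 + 0.4j])              # hypothetical z_2, ..., z_{n-1}
q = np.poly1d(np.concatenate(([0.0], zs)), r=True)   # q(z) = z * prod (z - z_j), degree n - 1
dq = q.deriv()
n = q.order + 1                                      # degree of Q(., u)

def critical_points(u):
    """Return Q(., u) = (z - u) q(z) and the zeros of its derivative in z."""
    Q = np.poly1d([1.0, -u]) * q
    return Q, Q.deriv().roots

# (a) rho computed from Q agrees with the expression in q alone.
u = 0.7 - 0.4j
Q, crit = critical_points(u)
dQ0 = Q.deriv()(0)
direct = np.array([abs(Q(z) / (z * dQ0)) for z in crit])
via_q = np.array([abs(q(z) ** 2 / (u * z * dq(z) * dq(0))) for z in crit])

# (b) on the sheet where zeta(u) tends to infinity, u / zeta(u) approaches n / (n - 1).
u_big = 1.0e4
_, crit_big = critical_points(u_big)
zeta_big = crit_big[np.argmax(np.abs(crit_big))]
ratio = u_big / zeta_big
```

The comparison in (a) uses only $Q'(\zeta, u) = 0$, so it holds on every sheet; (b) is a finite-$u$ approximation of the limit, so only rough agreement should be expected.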

2.2. $\zeta(u)$ on $G_k$ for $k > 1$. In these cases $\zeta(u)$ tends to some zero $\xi_k$ of 
$q'$. From 

\[ 0 = \zeta(u)\, q'(\zeta(u)) - u\, q'(\zeta(u)) + q(\zeta(u)) \]

we conclude that $u\, q'(\zeta(u)) \to q(\xi_k)$ if $u \to \infty$. Now we see from (8) that $f$ is 
holomorphic in $\infty_k \in G_k$, and $c_k := f(\infty_k) = \frac{z_n\, \zeta_0\, q'(\zeta_0)}{q(\zeta_0)^2\, \xi_k} \cdot q(\xi_k)$. Thus $f$ is holomorphic 
on $G_k$. Moreover $f$ does not vanish in $G_k$, because the zeros of $q$ are all in $\overline{E}$. But 
on the boundary (as well as on the boundary of $G_1$) there will be some zero, which 
comes from the zero(s) of $p$ on the unit circle. 

3. Blowing up and pulling back 

Let $r > 0$ and $p_r(z) = r^n p(z/r)$. If we start the considerations of the preceding 
section with $p_r$ instead of $p$ we have to replace the zeros $z_2, \dots, z_n$ of $p$ by $r z_2, \dots, r z_n$ 
and the derivative zeros $\zeta(u)$ by $r \zeta(u)$ as well as $q(z)$ by $r^{n-1} q(z/r)$. The variation 
is then 

\[ Q_r(z, u) := r^n Q(z/r, u) = (z - ur) \cdot r^{n-1} q(z/r) = z\,(z - ur) \prod_{j=2}^{n-1} (z - z_j r). \]

Note that the zeros of $Q_r(\cdot, u)$ are the points $0, ru, r z_2, \dots, r z_{n-1}$, and it has the derivative 
zeros $r \zeta(u)$, where $\zeta(u)$ denotes those of $Q(\cdot, u)$. 

As already mentioned we have $\rho(p_r) = \rho(p)$ for all $r > 0$. Let $u_0 = u_0(r)$ be some complex 
number of modulus $r$. If $r$ is large enough we may assume that 

\[ |f(r u_0(r), r \zeta(u_0(r)))| > |c_k| / 2 \]

if $r \zeta(u_0(r)) \in G_2 \cup \dots \cup G_{n-1}$. If $r \zeta(u_0(r)) \in G_1$ we may, because of the pole of $f$ in 
$\infty_1 \in G_1$, assume that $|f(r u_0(r), r \zeta(u_0(r)))| > 1$. Now (7) and (8) show 

\[ \rho(Q_r(\cdot, u), r \zeta(u)) = \rho(p, \zeta_0) \cdot |f(ru, r \zeta(u))|. \]

So $\rho(Q_r(\cdot, u_0(r)), r \zeta(u_0(r))) > \rho(p, \zeta_0)$ for all sufficiently large $r$ and all derivative 
zeros of this polynomial. If $\zeta_0$ has been taken above as an essential derivative zero 
of $p$ this says that 

\[ \rho(Q_r(\cdot, u_0(r)), r \zeta(u_0(r))) > \rho(p) \]

for all derivative zeros $r \zeta(u_0(r))$ of $Q_r(\cdot, u_0(r))$. This gives, together with the remark 
above, $\rho(Q_r(\cdot, u_0(r))) > \rho(p_r) = \rho(p)$. 

The polynomial $Q_r(\cdot, u_0(r))$ has all its zeros in $|z| \le r$ and one more zero on the 
boundary of this disk than $p$ has on the unit circle (namely $u_0(r)$, into which $z_n$ has 
been changed). By $p^*(z) := r^{-n} Q_r(zr, u_0(r))$ we pull all the zeros back into the 
closed unit disk, and so we have found a polynomial which has one more zero on the 
unit circle than $p$ and which fulfills $\rho(p^*) > \rho(p)$. 

We can repeat this argument until we obtain a polynomial whose non-zero zeros all lie on the 
unit circle and whose associated number is bigger than that of $p$. This finishes the 
proof of Theorem 1. 

4. Proof of Smale's conjecture 

It remains to compare $\rho(p)$ for polynomials $p(z) = z \prod_{j=2}^{n} (z - z_j)$ with $|z_2| = 
\dots = |z_n| = 1$. For such polynomials Smale's conjecture has already been proved by 
Tischler. He also determined the maximal polynomials for this subclass of $\mathcal{F}_n$ 
as $p(z) = a_1 z + a_n z^n$ with $a_1, a_n \in \mathbb{C} \setminus \{0\}$, and we have the result that these are 
indeed the only maximal polynomials in $\mathcal{F}_n$. 